Researchers at the Hebrew University of Jerusalem recently found that in Retrieval-Augmented Generation (RAG) systems, the number of documents processed significantly affects language model performance, even when the total text length remains constant. The research team ran experiments on 2,417 questions from the MuSiQue validation dataset, each linked to 20 Wikipedia paragraphs. For each question, two to four of these paragraphs contained relevant answer information, while the remaining paragraphs served as distractors. To study the impact of the number of documents, the team created multiple data partitions, progressively reducing the number of documents from 20 to...
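
The partitioning step described above can be illustrated with a minimal sketch. It is not the researchers' code. The `Paragraph` type, the `reduce_documents` function, the random choice of which distractors to drop, and the target count of 10 are all assumptions made for illustration. The sketch covers only document selection: it does not show how the total text length was held constant, which this excerpt does not describe.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Paragraph:
    """Hypothetical container for one retrieved paragraph."""
    title: str
    text: str
    is_supporting: bool  # True if the paragraph holds answer-relevant information


def reduce_documents(paragraphs, target_count, seed=0):
    """Return a subset of `paragraphs` with exactly `target_count` items.

    All supporting paragraphs are kept; distractors are dropped at random
    (an assumption, not necessarily the study's procedure). The original
    ordering of the surviving paragraphs is preserved.
    """
    n_supporting = sum(p.is_supporting for p in paragraphs)
    if not n_supporting <= target_count <= len(paragraphs):
        raise ValueError(
            f"target_count must be between {n_supporting} and {len(paragraphs)}"
        )

    # Indices of distractors, from which we sample the ones to keep.
    distractor_idx = [i for i, p in enumerate(paragraphs) if not p.is_supporting]
    rng = random.Random(seed)
    keep = set(rng.sample(distractor_idx, target_count - n_supporting))

    return [p for i, p in enumerate(paragraphs) if p.is_supporting or i in keep]


if __name__ == "__main__":
    # Toy input: 20 paragraphs, 3 of them supporting (the source says two to four).
    docs = [Paragraph(f"doc{i}", "placeholder text", i in (2, 7, 15)) for i in range(20)]
    subset = reduce_documents(docs, target_count=10)  # 10 is an arbitrary example value
    print(len(subset), [p.title for p in subset if p.is_supporting])
```

Seeding the random generator keeps each partition reproducible, so the same distractors are dropped every time a given question is evaluated.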